Introduction



When we think of persistent barriers to educational equity, bias in research does not 
come readily to mind. In fact, it rarely comes to mind at all. Yet it should. The results
of educational research strongly influence education. Much of what children read 
and how they are taught is based on research, and much research— past and 
current— is biased. Societal values, including attitudes about women and men, 
people of color and whites, can affect all members of society, including researchers;
biased values can cause research results to be incomplete, exaggerated, and, in many 
cases, just plain wrong. 

Research is used to develop and evaluate educational programs and materials, 
but if the research is biased, the information used in decision making is probably 
wrong. We need to be aware that race and sex bias can affect research, learn how
to determine if bias is present, and, if it is, know how to minimize its effects.

"We" includes everyone involved in education: teachers, parents, administrators,
counselors, and even students. Since all of us use, or are affected by,
educational research results, we all need to know more about research and its 
strengths and weaknesses. 

This monograph examines educational research and some of the myths that
surround it. It covers many ways that bias can affect research, including its influence
on who or what is studied, how the study is done, and what conclusions are drawn.
Using examples from the past and the present, this monograph examines some 
effects of biased research and concludes with suggested guidelines for evaluating 
research and "next steps" to reduce the incidence and effects of bias on research.
Educational Research:
Why Bother?



We may wonder why we should become involved with educational research. The 
role that teachers, administrators, parents, students, politicians, and even the
public play in education is clear, but the importance of educational research in
educational policy and practice is more difficult to see. Yet despite the apparent
invisibility of educational research, its impact is pervasive. 

Research results influence what is taught, how it is taught, and even the books
and materials used (National Academy of Education, 1984). Research helps us
assess everything from the relative effectiveness of different teaching methodologies
to the relative value of computer-assisted instruction and computer simulations.
For example, thanks to educational research, we have discovered the
following:

• Class size is important. Students in very small classes do noticeably better 
than students in classes of "average" size, while students in large classes do
slightly worse. 

• Children learn math and science best when they use physical objects and do
hands-on experiments. 

• A good preschool experience has a strong positive influence on the long-term
development of children at risk, particularly boys. 

• Nonviolent methods of discipline, such as time out, are effective alternatives 
to corporal punishment. 

• Computer-assisted instruction is an effective tool for basic skills remedia- 
tion. 

As the National Academy of Education concluded, "the findings of educational
research generate principles and precepts for educational innovation and professional
practice" (1984, p. 1). Good research can make education more effective.

Most of us have had little experience with research other than what we may
have learned in an undergraduate or graduate course. This lack of experience
contributes to a view of research as infallible and mysterious. Research is
considered sacrosanct, rather than something to be evaluated and either accepted or
rejected. Unless proven otherwise (and sometimes not even then), research is 
viewed as scientific, objective, valid, and good. 

However, research, like everything else, can be bad or good, subjective or 
objective, accurate or wrong. Consider the following recent examples from
reputable journals of research:
• Even though invalid, unreliable research is inaccurate research, Bordelon
concluded, in a 1983 article published in The Reading Teacher, that "while
some of the research does not follow good scientific principles for reliability
and validity, much of it can help the classroom teacher." (p. 796)

• Without evidence, Collis, in a 1985 International Journal of Women's 
Studies article on sex differences in computer use, concluded that "females 
in this study appear to be their own enemies." (p. 213)

• Even though results were the same for disabled and able-bodied preschoolers,
a 1983 Journal of Educational Research article by Fuchs et al. reported
only the results for disabled preschoolers.

The factors that can cause research to be inaccurate or incomplete can be 
divided into two major categories: traditional sources of invalidity and societal 
biases. 

Learning the traditional sources of invalidity and how to combat them is an 
important part of training researchers. Readers of research, too, need to be aware
of these sources of invalidity, which include poor measurement techniques, a poor
research plan, failure to account for differences among groups being studied (e.g.,
one group is older), and incorrect statistics. The Appendix, "A Beginner's Guide
to Educational Research," provides an explanation of sources of invalidity as well
as an overview of the basic components of research. 

An examination of societal bias is not typically a part of the training that 
researchers or readers of research receive, even though the impact of bias can be 
devastating. Biased attitudes and stereotypic ideas about groups of people, based 
on their race, sex, or cultural background, greatly affect research. 

Researchers are not immune to the influence of American racism (Thomas and
Sillen, 1972); neither are they exempt from societal beliefs about the differences
between women and men. Philosophers of science have discovered that scientific
objectivity quickly becomes subjective under the influence of intense feelings
(Nagel, 1961). As Russell concluded: "As soon as any strong passion intervenes to
warp the expert's judgment, he [sic] becomes unreliable, whatever scientific
equipment he may possess" (1959, p. 276).
Demystifying Research:
Exploding Some Myths 



Before using any research, readers must have confidence in the quality of the 
research and in their own ability to assess that quality. Research is neither infallible
nor incomprehensible; you do not have to be a statistician to evaluate it. If a piece
of research doesn't make sense to you, it may be because the research just doesn't
make sense. At the same time, this doesn't mean that research can, or should, be
evaluated from a position of ignorance. There are some basics, such as those
covered in the Appendix, that people need to know in order to begin to understand
and evaluate research. To demystify research, people need to gain familiarity with
research terms, methods, strengths, and weaknesses, and they need to use their
understanding to assess all research. 

An important step in the demystification process is to examine some of the 
myths that surround research and researchers, including the following: 

I. If it's been published, it must be true. 

The publishing process weeds out some, but not all, bad research. In 
addition, studies finding differences between groups are more apt to be 
published than those finding no differences. "Let the buyer beware" also
applies to research. 

II. Research results remain true over time.

While some results remain true through the years, many do not. The
changes of the past twenty to thirty years can mean that much of what we
once "knew" no longer holds true. This is particularly true in education.
As expectations and educational environments change, so do characteristics
of students and teachers.

III. Sex and race differences exist in education.

There are few cognitive differences between girls and boys. Where 
differences exist, they are much smaller than the differences found within
groups of boys or groups of girls. This is true of comparisons by race as
well. In comparisons by race and by sex, "within group" differences are
always greater than "between group" differences. Statements about the
"average" girl and the "average" boy are powerful but misleading. In
addition, such statements can become self-fulfilling, with the reader 
confusing the average with the individual. 
IV. The beginning and end are the only important parts of a research study.

While the introduction and the conclusions may be the most interest- 
ing and readable sections, they do not provide enough information to 
assess the quality of the research. Only by reading the whole study,
including its design and results, can one determine whether the conclusions
are backed up by what actually happened.

V. Research and researchers are "objective," uninfluenced by societal values
or their own view of the world. 

While most researchers try to keep their interests and biases removed 
from their research, doing so is very difficult, if not impossible. As people 
from mystery writer Amanda Cross to philosopher Thomas Nagel have
concluded, researchers are not immune to the influences of the world
around them: "Kate marveled, not for the first time, at the ease with which
academics deserted the cause of scholarly disinterest when their own most
cherished opinions were at stake" (Cross, 1984, p. 67); "It is not easy . . .
to prevent our likes, aversions, hopes and fears from coloring our conclu- 
sions" (Nagel, 1961, p. 488). In addition, researchers are often unaware 
of their biases, particularly when they are reinforced by society. 

VI. I could never understand research. 

With a little knowledge of statistics, interested people can use their 
problem-solving and critical-thinking skills to understand and assess most
educational research. 
Stereotypes and Biases
in Research 



Of all the myths cited in the previous chapter, the myth that research and researchers
are always objective is perhaps the most difficult to refute. We, as a society, have
invested science and research with an aura of truth. It is difficult to accept that
research and researchers are influenced by the world around them, including its 
biases and stereotypes. 

Given the strength and pervasiveness of societal attitudes, we all hold some 
biased ideas. Stereotyping (the assigning of traits and abilities to people based on
their sex, race, or cultural background) is a fast but generally inaccurate way of
categorizing people. Seeing a tall, lean, young Black man, people often think
"basketball player"; seeing a girl climbing trees, they think "tomboy." In research
as in life, stereotypes sometimes are accurate but more often are not.

As a society and as individuals, we hold a variety of preconceptions, many of
which are based on people's race and sex. Even before a child's birth, our
expectations of the fetus are based on its perceived sex rather than on individual
differences. Many people, for example, still believe that if a fetus kicks a lot, it's 
a boy, whereas if it's quiet, it's a girl. 

Expressions such as "she thinks like a man" and "don't worry your pretty little
head" indicate society's view of girls and women and their intellectual abilities and
interests. Complimenting someone by saying "she thinks like a man" implies that
women and men have different thinking processes and that men's thinking processes
are better. "Don't worry your pretty little head" implies that women's
reasoning processes are different, less logical, and less serious than men's.

The situation is similar with respect to race. Expressions such as "Latin lover"
and myths about Black sexual prowess reflect societal attitudes about people of
color. They combine with myths about rhythm and a "happy-go-lucky" people to
reinforce an image of Black people as less serious, less intellectual, and more
"earthy" than whites.

While the statements above may be considered "just expressions," they are
expressions that reflect societal racism and sexism. They are also expressions that
set up (and reinforce) expectations and influence research. For example, the
expressions found in the preceding paragraphs contribute to many researchers'
expectations that sex and race differences will be found in studies of intellectual
areas. As researcher Stephen Issac cautions:

What the researcher expects to see, where he [sic] directs his attention, what he ignores 
or forgets, what he remembers or records, and even the way he interacts with subjects
to alter their own expectations and motivational states, all can influence the results to
fit his preconceptions. (1975, p. 38)
In other words, his (or her!) preconceptions can affect what is studied, how it is
studied, and what is concluded about the results.

Bias and What Is Studied 

Traditionally, people of color and women of all races have not been considered as 
important as white men, and less research has been done using them as subjects. For 
example, as recently as 1980 the editors of the Handbook of Adolescent Psychology
announced that there was not enough research on adolescent women to warrant a
chapter, even a short one (Gilligan, 1986). 

When research has been done on groups other than white men, it has frequently
focused on women and people of color as deviant or as victims. For example, 
researchers have asked: 

• "Are women feminizing our schools?" (Sexton, 1969) not "Are men
masculinizing our schools?"

• "Are Black people as intelligent as white people?" (Bettleheim and
Janowitz, 1964) not "Are our intelligence tests biased?"

• "What are the negative effects of maternal employment on children's 
academic achievement?" (U.S. Department of Education, 1983) not "What
are the positive effects of increased family income and maternal intellectual 
stimulation on children's academic achievement?" 

Just as "the lake in which one decides to fish predetermines the kind of fish one
will catch" (Stetson, 1982, p. 65), the questions one asks influence the answers one
gets. 

Societal attitudes and bias determine an issue's "cultural significance," which
in turn affects the value given to different research topics. For example, researchers
working on topics related to women and people of color report the need to
"balance" their work with research on more highly valued areas (Campbell, 1980).
Reflecting the experiences of many researchers, Steinem concluded:

My own work on theories of gender-based power was academically suspect as
single-factor analysis while my neighbor's work on one man's military acts during one decade
was thoughtful, scholarly and basic. (1980, p. 98)

For many years, topics considered "women's issues," such as childbirth and
informal support networks, were little valued and rarely examined. When these
topics were covered, it was frequently from a perspective based on limited ideas of
the roles women and men should play. For example: 



• Studies of teachers who want to remain in the classroom (predominantly
women) as opposed to those who want to move from teaching to administration
(predominantly men) are seen as studies of deviant behavior rather than
studies of different levels of aspiration. (Shakeshaft, 1979)

• There is much research on the problems of female-headed households and
single-parent families, but little research on problems of two-parent families.
(Committee on the Status of Women in Sociology, 1980)

• Studies of work, with rare exceptions, include only paid work. The work
done by so many women, such as unpaid housework, child rearing, and 
agricultural work, is not included in studies of the labor force or in the gross 
national product. (Oakley, 1977) 

Not surprisingly, bias has been found in studies of people of color as well.
Omission is a major problem; usually, for example, people of color are not included
in general studies. And when people of color are studied, the emphasis has been on
them as victims or "problems." (Blacks are the group most often studied in this
regard, followed by Latinos. Other racial groups, including Asian Americans and
Native Americans, are rarely, if ever, studied.) In 1968, Billingsley found that
studies of Black families displayed a selective focus on the negative aspects. In
1972, Thomas and Sillen concluded that "seen narrowly as a victim, the Black man
appears in the learned journals as a patient, a parolee, a petitioner for aid, rarely as
a rounded human being" (p. 47).

Little has changed. In 1984, Mathews, after surveying the literature on people
of color and mathematics, reported that "the emphasis has been on minorities that
have been unsuccessful in math, on why minorities don't enroll in math rather than
why those who continue do" (p. 170). She found little focus on successful Black and
Latino math students, and no comparisons of successful and unsuccessful math
students of color.

The situation is even more bleak regarding research on both race and sex. 
Reports Scott: "One is almost overwhelmed with the . . . intellectual void that exists
among social science scholars concerning the life experiences of Black women"
(1982, p. 85). And even less data have been compiled about women and girls from
other racial groups (Scott-Jones and Clark, 1986).

Since research topics are determined in large part by societal ideas of what's 
important and what's not (or what's right and what's not), systematic gaps in the 
educational knowledge base remain and grow. 

Bias and Previous Research

New research builds on previous research and theory. However, much previous
research used men and the male experience as norms against which all experience
was assessed. Consider Kohlberg's widely accepted theory of moral development.
His six stages of moral development were empirically derived from a longitudinal
study (one done over a period of years) of eighty-four boys in the United States.
Even though the group studied was limited, the stages were said to be universal. In
addition, boys and men were generally found to be at higher levels of moral
development than women and girls were (Kohlberg and Kramer, 1969). This
finding was seen as a problem not with the stages but with the women and girls.
When Gilligan (1980) studied moral development in terms of women's lives rather
than trying to evaluate those lives in terms of male-based models, she developed a
theory very different from Kohlberg's. 

Models based on half (or less than half) of the human race are, by definition,
incomplete. Achievement motivation, like moral development, is an area in which
a theory developed and tested on men and boys was used to explain the behaviors
of females as well (Atkinson, 1958). Almost twenty years passed before there was
an effort to look at the theory in light of the realities of women's and girls'
achievement motivation (McClelland, 1975). Yet despite the growing awareness
that any theory based on males alone does not reflect the total human experience,
there is still a general assumption that theories based on whites reflect the total
human experience. When these theories are used as the basis for further research,
their inherent bias is perpetuated. 

If belief in a theory is strong enough, it takes a lot to shake it. For example, 
although for many years Cyril Burt was the "guru" of work on the genetic basis of
intelligence, by 1958 many were aware that Burt had falsified his data in order to
support his theory that heredity determines intelligence. Nevertheless, Burt's work
continued to be cited in almost every psychology textbook published over the next
twenty years (Hearnshaw, 1979; Lewontin, Rose, and Kamin, 1984).

Strong opinions about women and men, about people of color and whites, can
result in theories flexible enough to support those biases. The topic of sex
differences in brain hemispheres, in "brain lateralization," is but one example. One
researcher concluded that where men tend to show greater lateralization, such as in
spatial skills, greater lateralization seems to be correlated with greater ability.
However, she also concluded that when women show greater lateralization, then
that same greater lateralization may be correlated with less ability (Witelson, 1978).

Sometimes it appears that the old saying "don't confuse me with facts" should
be changed to "don't worry, with my theory I can explain away facts."

Bias and How Research Terms Are Defined 

One major way that bias influences research is in how terms are defined. Nowhere
is this more true than in definitions of race. There are two major definitions of race:
social and biological. A social definition of race is based on societal perceptions;
in other words, if the society views you as Black, then you are. A biological
definition of race is based on genetics, on the presence or absence of biological
characteristics unique to a specific race. These two different concepts are often
confused, and the same label is used for people with very different backgrounds. For
example, children of one white and one Black parent are frequently classified as
Black and studied as such. Historically, and to a great degree even today, one is
defined as either "all white" or "not white." There is no standard, consistent way
to define people by race, making it very difficult to determine what racial differences
actually exist. Studies concluding a genetic or biological basis for Black/white
differences in intelligence and achievement (e.g., Jenson, 1969) have used an
individual's self-definition of race to determine racial classification. However,
when social definitions of race are used, no conclusions about genetic or biological
differences can be made.

Ignoring subjects' socioeconomic status can also affect findings on race. Most
early studies of racial differences did not control for the effects of socioeconomic
status, even though the white sample, reflecting society in general, generally had a
higher socioeconomic status than the Black sample did (Pettigrew, 1964). Differences
in behavior were, however, concluded to be racial; the influence of socioeconomic
status was not even considered (Graves and Graves, 1978; Pettigrew, 1964).
The situation has not changed much. Even in 1986, Scott-Jones and Clark reported
that researchers were still not accounting for the impact of socioeconomic status on 
studies dealing with racial background. 

Definitions of socioeconomic status can also lead to biased research. Until the 
1970s and even occasionally today, the socioeconomic status of a woman was
defined by that of her father (if she was single) or her husband (if she was married).
Thus the brain surgeon married to the bricklayer and the waitress married to the
truck driver were considered, for research purposes, to be from the same socioeconomic
level, whereas the typist married to the accountant was at a higher level. This
arrangement made cross-group comparisons of women using socioeconomic status
very suspect (Nichols, 1978). 

Bias affects research in other ways as well. One crucial factor is that the
research done on people of color and on women is frequently not used in the design
of other studies. Consider, for example, the following findings:

• Black students of low socioeconomic status were found to test better with 
Black testers, while for other Black students the race of the tester made little
or no difference (Samuel, 1977; Sattler, 1970). Therefore, studies using
white testers may exaggerate Black class differences.

• Boys have been found to exhibit more antisocial behaviors when an adult is
present; girls' behavior does not change (Caplan, 1975). Studies of behavioral
sex differences that ignore this information can lead to inaccurate
conclusions and a reinforcing of stereotypes about boys' behavior.

• Boys perform better when someone is watching; girls perform better in 
cooperative situations than in competitive ones; and both sexes tend to act
more stereotypically in the presence of adults (Greenberg, 1978; Maccoby
and Jacklin, 1974). Studies with adult observers or studies using competitive
testing that do not account for these findings will be inaccurate.

Not controlling for such findings, particularly in studies of race or sex differences,
means that what is being studied is more apt to be some combination of the testing 
procedure and sex and race than just sex or just race. 

Yet even today, graduate students report that their professors tell them always 
to do separate analyses by sex, without telling them of the pitfalls and complexities
of such analyses (Lipson, 1986). Analysis by sex without an awareness of the
previous research on sex differences and without accounting for other possible areas
of difference generally leads to inaccuracies. 

Bias and Who Is Studied 

Traditionally, men have been the population studied in research related to education
and other social science areas. An analysis of samples in research published in
education and social science journals found that 22 percent of the articles did not
even give the sex of those studied; almost half of the articles that did note their 
subjects' sex involved only one sex, most frequently males (Campbell, 1981). 

While the number of single-sex studies has been reduced, they are still being
done and the sex most likely to be studied is still male (Ward and Grant, 1983). In
addition, most longitudinal studies (those that continue for a number of years) began
by collecting information on men and boys and have not been updated to include 
women and girls. 

A somewhat different pattern emerges for people of color. While there have 
been a number of studies of "the Black experience," Blacks are only infrequently
included in studies of human behavior (Stivers and Leckie, 1976). Other people of
color are rarely studied at all. Most studies of human behavior do not even mention
the racial breakdown of the subjects, even though further inquiries have found that
such studies are generally based on white samples (Campbell, 1981). 

Although most samples do not include girls and boys from a variety of racial
and cultural groups, this fact has not stopped researchers from concluding that a
study's results apply to them. Conclusions generalizing to all dyslexic children,
when only boys were studied and no information about race was included (Frauenheim,
1978), or conclusions about the economic progress for Blacks when only
Black men were studied (Smith and Welch, 1986), are common. A survey of
educational research studies found that more than 90 percent of them overgeneral- 
ized their results (Campbell, 1981). 

Since researchers usually generalize their results to "humans," and humans
include both sexes from all racial and cultural groups, one might wonder why
researchers would use single-sex or single-race samples to study the human
condition. Although many researchers don't acknowledge the bias in their samples,
those who do give some interesting rationalizations for their choices. One
researcher explained that he used only males "to avoid introducing an additional
variable that might detract from individual and group examination" (Frauenheim,
1978, p. 22). Another researcher excluded Black subjects because they were 
rejected by their peers less than white students were and because rejection was what 
was being studied (Bryan, 1976). Asian Americans, Native Americans, and to a 
lesser degree Latinos are so infrequently included in research that most researchers 
don't even bother to justify their exclusion. 

Researchers have given three major reasons for working with single-sex
subjects: (a) "scientific," (b) practical, and (c) "extrascientific" (Prescott and
Foster, 1974, p. 3). "Scientific" reasons given were that "sex differences were
known to exist in the phenomena and the investigator did not wish to explore them"
and "the theory being studied was restricted to one sex." The sex of those who were
available to be studied and the need to keep the number of subjects "reasonable"
were given as practical reasons, while the "extrascientific" reasons were that the use
of one sex reduced the variability of the data and that the experiment "favored" the
use of one sex. Research has not been done on why researchers have chosen all-white
samples, but it is not unreasonable to assume that the explanations would be
similar. 

Including females as well as males and people of color as well as whites can
make a study more complex but also more accurate. Single-sex studies are
occasionally necessary, as in studies of childbirth, and are sometimes understandable,
as in studies of football players (Ward and Grant, 1984). Single-race studies,
however, are more difficult to justify. The rule should be that if the results are going
to be applied to females and males from different racial/cultural backgrounds, then
the study sample should include them. Research cannot be generalized to popula- 
tions not represented in a study. 

Bias in Research Tests and Measures 

For a number of years there have been concern and debate about the impact of race 
and ethnic bias on achievement and aptitude testing. For example, as early as 1931,
studies indicated that verbal IQ tests in English were not good measures of
Spanish-dominant or bilingual children (Altus, 1933). This finding was, however, generally
ignored.

That tests are written by white middle-class authors and standardized and 
normed on white middle-class students but used on poor children and children of 
color has long been recognized as a problem. It is, however, only recently that test
developers and users have attempted to resolve it.

Additional issues exist. For example, a 1970 study concluded that social class
was a more important factor than racial origin in predicting intelligence test scores
for white and Latino children in the United States (Christiansen and Livermore,
1970). Cultural background, geographic isolation, and low socioeconomic status
often combine to provide the child of color with a frame of reference very different
from test developers' (Anastasi and Cordova, 1953; Tyler and White, 1978).

Sex bias can also affect testing. Research suggests that girls do better on test 
items dealing with "stereotypically feminine" topics than on "stereotypically
masculine" items covering the same skills. In the past, tests, particularly achievement
tests, have generally included more "stereotypically masculine" than
"stereotypically feminine" topics, thereby negatively influencing girls' scores (Coffman,
1961; Donlon, 1971). Based on such variables as the skill areas tested, the context
in which test items are set (a birthday party or a football game), and the type of test 
item (a multiple-choice or an essay question), sex differences can be created or 
eliminated (Dwyer, 1976; Campbell and Scott, 1980). 

Achievement tests still include proportionately more "stereotypically masculine"
items, but many of today's standardized IQ tests have been carefully balanced
to eliminate the sex differences found in earlier versions. Tests have been developed
to support the assumption that females and males are equal in intelligence
(Lewontin, Rose, and Kamin, 1984). There has been no similar assumption about
whites and Blacks, Native Americans, or Latinos, and therefore tests have not been
"balanced" for these groups.

Other examples of bias in testing include the following: 

• Personality tests can overpredict psychological problems in people of color
who, either by circumstance or by choice, are not assimilated into the
mainstream culture. Incorrect answers are assumed to be based on ignorance
of common values when they may be based on an opposition to those values
or on differences in situations. (Cowan, Watkins, and Davis, 1975)

• A test that measures "need to achieve" gives pictures to subjects and asks
them to make up stories about the pictures. Girls' lower number of
achievement-oriented stories about females is seen as evidence of girls'
lower need to achieve, rather than as an awareness that girls and women as
a group are not generally permitted to achieve in our society. (Kaufman and
Richardson, 1982)

• Self-reports of subjects' achievements and confidence do not take into 
consideration the possible effects of modesty and self-effacement; such traits 
are more likely to be taught to members of some groups (e.g., Asian 
Americans of both sexes and most women) than to others. (Kaufman and 
Richardson, 1982) 

• Clothing can affect testing. For example, studies of children at play might
in reality be measuring differences in playing in dresses and playing in pants.
(Campbell, 1981)

• Observations can be affected by bias. Studies in which some observers were
told they were observing a little boy and others were told the same child was
a girl found the perceived sex of the child affected the observers' response.
(Gurwitz and Dodge, 1975; Herman and Serbin, 1977)

The decision as to which standardized test is used in research can influence
what is found. Thirty years ago it was found that differences in the intelligence test
scores of bilingual students and Spanish-dominant students were much reduced
when nonverbal IQ tests were substituted for verbal IQ tests (Anastasi and Cordova,
1953). Much more recently, we learned that sex differences in mathematics were
significant if the Scholastic Aptitude Test Math (SAT:M) was used but were
minimal or nonexistent if the School and College Aptitude Test-Quantitative
(SCAT-Q) was used (Benbow and Stanley, 1983).

Recently a major study concluded that Blacks have made great economic 
progress in the past forty years; the study was based on comparisons between the 
median income of the employed Black male and that of the employed white male. 
If measures such as per capita income, poverty rates, and employment rates were 
used— or even if Black women had been included in the study— the conclusions 
would have been quite different (Crawford, 1986; Smith and Welch, 1986). 

Bias and What We Learn from Research 

Many factors outside the research process can and do affect research results. One
such example is the sex of the researcher. When researchers first examined 148
studies on how people are influenced, they concluded that women are more easily
influenced than men. However, when they looked at the sex of the studies' authors,
they found that male researchers were more apt than female researchers to find
women more easily influenced. Similar results were found in studies of people's
skills in understanding nonverbal behavior; female researchers were more apt than
male researchers to find women better at decoding nonverbal behavior (Eagly and
Carli, 1981).

In these two examples, researchers portrayed their own gender more favorably.
One wonders what the results would be if there were equal numbers of female and
male researchers: equal numbers of sex differences favoring women and men, or
perhaps no sex differences at all? Similarly, one wonders what the results would be
if the number of researchers of color were equal to the number of white researchers.

The sex of the researcher is not the only important "outside" variable. When
the research was done may also be significant. In the past twenty years, the roles
of women and men, people of color and whites, have changed tremendously; so have
the attitudes and tools of many researchers. For example, the studies of girls' and
boys' vocational interests conducted more than ten years ago used tests that treated
girls' and boys' interests and aspirations differently. One of the best-known
vocational-interest inventories, the Kuder, provided pink answer sheets for girls
and blue ones for boys. Career suggestions were determined by sex; careers such
as physician, airline pilot, and veterinarian were suggested for boys, and
stewardess, hotel housekeeper, and nurse, for girls. Studies using these tests hold
little relevance for today's students or counselors.

There are other, more complex examples of the effects of a study's age. A 1982
analysis of studies of sex differences in cognitive abilities found males generally
outperforming females in quantitative, spatial, and articulation areas and females
outperforming males in verbal areas. However, a strong correlation was found
between the sex differences reported and the dates of the studies. The older the
study, the more likely there were to be sex differences and the larger those
differences were (Rosenthal and Rubin, 1982).

A 1978 review of individuals' skills in "reading" people found a similar
correlation. The more recent the study, the more women's scores improved
compared with men's (Hall, 1978). Thus the date of the study should be considered
in evaluating it.

Yet another factor influences the accuracy of the research we read. The
publication and dissemination processes favor the finding of differences, not
similarities. Studies finding statistically significant differences are more likely to
be submitted for publication and more likely to be published. Such studies are, in
fact, even more apt to be completed than those with no significant differences.
Researchers are more than four times as likely to give up on a problem if preliminary
work reveals no significant differences (Greenwald, 1975; Smith, 1980). Thus a
study finding sex or race differences is more apt to be published (and read) than a
study finding no differences.

This "publication bias" means that there is a focus on difference, on one group
being found higher, better, or more skilled than others. Since this is what we are
most apt to read or hear, it is also what we tend to believe.
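
The filtering effect described above can be illustrated with a small simulation. This is a minimal sketch, not drawn from any of the studies cited. It assumes two groups with no true difference at all, a hypothetical sample of 30 per group, and a rule that only studies passing a rough 5 percent significance cutoff are "published." Every name and number in it is an illustrative assumption.

```python
import random
import statistics


def simulate_study(rng, n_per_group=30, true_gap=0.0):
    """Simulate one hypothetical study; return (observed gap, is_significant)."""
    group_a = [rng.gauss(true_gap, 1.0) for _ in range(n_per_group)]
    group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    gap = statistics.mean(group_a) - statistics.mean(group_b)
    # Standard error of the difference between two independent means.
    se = (statistics.variance(group_a) / n_per_group
          + statistics.variance(group_b) / n_per_group) ** 0.5
    # Rough two-sided 5 percent cutoff (normal approximation).
    return gap, abs(gap / se) > 1.96


def run(n_studies=2000, seed=0):
    """Compare all simulated studies with the 'published' (significant) subset."""
    rng = random.Random(seed)
    results = [simulate_study(rng) for _ in range(n_studies)]
    all_gaps = [abs(gap) for gap, _ in results]
    published = [abs(gap) for gap, significant in results if significant]
    return {
        "share_significant": len(published) / n_studies,
        "mean_abs_gap_all": statistics.mean(all_gaps),
        "mean_abs_gap_published": statistics.mean(published),
    }


if __name__ == "__main__":
    print(run())
```

Under these assumptions only a small share of the simulated studies reach significance, yet that "published" subset reports a far larger average gap than the full set does, even though the true gap is zero. A reader who sees only the published studies would conclude that a difference exists.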

The factors cited above are important, but the most influential factor is the
attitudes of the researchers themselves. As discussed earlier, what researchers
believe can determine what they see and how they interpret data. An early example
can be found in Yerkes' work on chimpanzee behavior, work often quoted as
justification for "natural" sex roles. Yerkes concluded that male chimps were
naturally dominant, female chimps naturally subordinate. However, Herschberger,
using the same data, drew some very different conclusions. Speaking from the
perspective of a female chimp, she wrote:

When Jack takes over the food chute, the report calls it his "natural dominance." . . .
While I'm up there lording it over the food chute, the investigator writes down, "the male
temporarily defers to her and allows her to act as if dominant over him." Can't I get any
satisfaction out of my life that isn't allowed me by some male chimp, damn it? (1948,
p. 10)

Drawing conclusions based on attitudes of what is appropriate is not limited to
studies of chimps. In 1885 a Psychological Review article concluded that whites'
slower reaction time (compared with Blacks' and Native Americans') was proof
that whites were the superior group (Gossett, 1963). In 1887 Romanes found that
women's better reading skills indicated a lack of "deeper qualities of the mind"
(Tolbin, 1972, p. 49). There are more recent examples:

• A 1966 study of self-esteem found, to the authors' surprise, high Black
self-esteem; they concluded, without evidence, that it was a defense mechanism
against discrimination. (McDill, Meyers, and Rigsby, 1966)

• Studies of sex differences were found to be more likely to use the term
superiority when the differences were in the men's favor than when they
favored women. (Parlee, 1975)

• A "classic" study of infant behavior concluded that boys try to solve a
problem, while girls give up. It did not mention the equally plausible
conclusion that the boys and girls tried to solve the problem in different ways.
(Goldberg and Lewis, 1969)

• An analysis of studies of Blacks found that most researchers (82 percent)
"blamed the victim," concluding that when Black/white differences were
found, negative differences experienced by Blacks were due to the individuals'
shortcomings rather than suggesting other possible explanations such as
racism. (Caplan and Nelson, 1973)

• A 1961 analysis of studies of girls' and boys' reading skills found authors
more apt to conclude that a study was tainted or that a mistake had been made
when a study did not find girls better readers than boys. (Coffman, 1961)

If researchers have strong expectations, they may include only the results that
support those expectations or assume that results are "really" significant even when
no significant differences are found.

The author of a recent study of sex differences in mathematics concluded that
girls are more apt than boys to make math mistakes, even though a table on the very
same page as that statement showed no significant differences between girls and
boys (Caplan, MacPherson, and Tobin, 1985). Similar results have been found in
studies of sex differences in spatial skills. For example, males have been
consistently reported as scoring higher than females in doing line mazes. Yet only 18 of
105 studies statistically compared female and male scores, and in only 4 of the 18
were male scores significantly higher than female scores (Caplan, MacPherson, and
Tobin, 1985). This problem goes far beyond sex differences in mathematics. Other
studies have contradicted their own results, concluding, for example, that people
from father-absent homes feel more victimized and in less control (Pettigrew, 1964)
and that Black students are more likely than whites to feel teachers do not like them
(Brown, 1967).
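
The line-maze figures cited above are easier to weigh as proportions. The arithmetic below uses only the three counts given in the text (105 studies, 18 with a statistical comparison, 4 with significantly higher male scores); the variable names are illustrative.

```python
# Counts reported for the line-maze studies (Caplan, MacPherson, and Tobin, 1985).
total_studies = 105       # studies reporting on line mazes
tested_statistically = 18  # studies that statistically compared female and male scores
male_significantly_higher = 4

# Share of all studies that ran any statistical comparison.
share_tested = tested_statistically / total_studies
# Share of the tested studies in which males scored significantly higher.
share_significant_of_tested = male_significantly_higher / tested_statistically
# Share of all studies with statistical support for the commonly reported claim.
share_significant_of_all = male_significantly_higher / total_studies

print(f"{share_tested:.1%} of studies tested the difference")
print(f"{share_significant_of_tested:.1%} of tested studies favored males")
print(f"{share_significant_of_all:.1%} of all studies had statistical support")
```

In other words, about one study in six tested the difference at all, and fewer than one in twenty-five of the full set offered statistical support for the conclusion that was "consistently reported."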

It is important to realize that none of these researchers falsified their results;
they were not trying to fool us. They reported the results that did not substantiate
their conclusions, but their conclusions were greatly influenced by their own
attitudes and expectations. Until societal expectations for people of color and
whites and for women and men are equal, until there is equal respect for both sexes
of all races, then answers to questions about people of color and whites and about
women and men will reflect our own prejudices.

Historical Effects
of Biased Research



Since it is often easier to identify yesterday's biases than today's, let us begin by
looking at the effects of biased research in the past.

Intellect and Race

Until very recently, researchers repeatedly "proved" the intellectual inferiority of
men of color and all women. Measures such as brain weights, head sizes, and facial
proportions were used to "prove" that Anglo-Saxons were highest on the
evolutionary ladder, followed by Northern Europeans, Slavs, Jews, and Italians, with
Blacks trailing far behind. (Other racial groups were generally ignored.) This ranking
pertained only to males; Anglo-Saxon females were considered at the level of Black
males, and few bothered even to categorize women from other backgrounds
(Ehrenreich and English, 1979; Thomas and Sillen, 1972).

Morton, one of the early researchers in this area, measured the capacity of a
small number of human skulls and concluded that Blacks and Native Americans had
smaller brain cavities than whites did and thus were less intelligent. Like so many
other researchers, Morton was so convinced of his hypotheses that he simply
discarded the data that did not support his hypotheses: in his final calculations,
smaller skulls belonging to people of color were included, while smaller skulls
belonging to whites were "thrown out." Had the smaller white skulls been included,
Morton would have found no differences in skull size (Gould, 1981).

Researchers in this area were very good at finding flaws in research conclusions
that did not support their point of view. However, they routinely ignored the very
same errors in research results with which they did agree (Gould, 1981).

From skull size, researchers moved to brain weight as a measure of intelligence.
Again, expectations determined results. When no differences could be found
between brain weights of Blacks and whites, some very creative reasoning was used
to support the predetermined conclusion that whites were more intelligent. For
example, most studies of brain weights used the brains of unclaimed bodies. When
the results showed no differences linked to race, one researcher ingeniously
concluded, with no evidence at all, that only the lowest class of whites (prostitutes
and "the depraved") would become unclaimed bodies, while Blacks at all
socioeconomic levels would be abandoned at death! Thus the findings of no difference
were said to be based on the "fact" that the lowest whites were being compared with all
Blacks and that the data "do perhaps show that the low-class Caucasian has a larger
brain than a better-class Negro" (Bean, 1906, p. 409; Gould, 1981, p. 79).

The brain weights and thus, supposedly, the intelligence of white women and
men were also compared. Finding women's brains smaller than men's,
researchers completely ignored the reality that women are generally smaller than men and
concluded that women and men could not be treated equally until their brain weights
were the same (Tolbin, 1972).

By the twentieth century, intelligence tests began to replace brain
measurements. However, with few exceptions, the conclusions remained the same.
Ignoring test developers' assertions that test score comparisons should be made
only among children from similar backgrounds, Terman concluded that, in
comparison with whites, a low level of intelligence was "very, very common among
Spanish-Indian and Mexican families of the Southwest and also among Negroes.
Their dullness appears to be racial" (1916, pp. 91-92). He decided that children of
Spanish-Indian, Mexican, and Black parents "are uneducable beyond the merest
rudiments of training. No amount of school instruction will ever make them
intelligent voters or capable citizens in the true sense of the word. Judged
psychologically, they cannot be considered normal."

Terman was not a member of the Ku Klux Klan but rather a well-respected
psychologist and educator who had a great influence on education. (Indeed, his
work on the gifted is still being used today.) Like so many others, Terman let his
beliefs influence his research. He did not account for the effects of any cultural
biases in the tests he was using. Neither did he look at differences in the education
that poor Mexicans and Blacks were getting in comparison with the education
received by middle-class whites. He did, however, explain away embarrassing
exceptions to his theories. In one study, Terman found the intelligence scores of
hobos "distressingly high." Since it would not do for hobos to have higher scores
than "more respectable" people, Terman used only each group's lowest scores; the
hobos, instead of being in the middle where they belonged, sank to the bottom.

In The Mismeasure of Man, a fascinating debunking of research on intelligence 
testing, Gould concluded: 

The history of scientific views on race serves as a mirror of social movement; . . .
reflecting good times and bad; periods of the belief in equality and of rampant racism.
. . . Changes in research findings on intellectual inferiority reflect changes in society,
with biological determinism rising in times of political retrenchment. (Gould, 1981, p.
29)

Schooling and Sex

Research related to women and schooling has also served as a mirror of society.
When society wanted women in the home, research "discovered" a scientific basis
to justify women's remaining uneducated and at home. In Sex in Education: or, A
Fair Chance for Girls, reprinted for seventeen editions between 1873 and 1972,
Clarke concluded that higher education would cause a woman's uterus to atrophy.
This amazing "finding" was based not on medical reports but rather on data that
educated women were less apt to have children than uneducated women were. The
smaller proportion of married educated women and their greater access to birth
control were not considered as an explanation for their lower fertility rates; these
factors weren't even mentioned.

Female students were concluded to be pale, in delicate health, and "prey to
monstrous deviations from menstrual regularity" (Ehrenreich and English, 1979, p.
128). When one study found proportionately more educated women than men in
insane asylums, its authors concluded that higher education was driving women
crazy (Bullough and Bullough, 1973). Even G. Stanley Hall (1905), one of the
founders of modern psychology, wrote that the woman who used her brain "first lost
her mammary function" and had little hope to be other than a "moral and medical
freak" (Ehrenreich and English, 1979, p. 129).

These conclusions were made by researchers but were not based on research.
No controlled research was done on the relationship between higher education and
the physical loss of mammary function. Neither was research done on "monstrous
deviations from menstrual regularity."

Today, when institutions of higher education are courting women students,
research like this would not be condoned. The picture is different in athletics,
however, where women's role is much less assured. Recent conclusions about how
"serious athletic training" can negatively affect future maternity sound very similar
to past conclusions about "serious education" and maternity. The more things
change, the more they may remain the same.

Biased Research Conclusions: A Small Sample from the Past 

Conclusions such as those which follow were used to support policies that set up
dual educational systems based on race and denied higher education to women
(Campbell and Klein, 1982).

[The] scientific community has been blinded to the truth [of racial intellectual
inferiority] by the duplicity of Franz Boas, Communists, Jews and Sentimentalists.
(Garrett, 1961, p. 253)

Blacks in spite of being bereft of a moral sense do have a great compensating gift . . .
[T]hey all sing. (Evarts, 1914, p. 340)

All Negroes have a fear of darkness ... are careless, credulous, childlike and easily
amused. (Bevis, 1921, p. 69)

[It is] well known that among the colored race there are many women who are supremely
endowed with almost unique emotional equipment which makes their services ideal for
infants and young children. (Gesell and Ilg, 1943, p. 273)

[The ovaries] are the most powerful agents in all the commotions of her system;
. . . on them rest her intellectual standing in society, her physical perfection. (Bliss, 1870,
p. 96)



Current Effects 
of Biased Research 



While it may not be difficult to see bias in historical research, it is often difficult to
see it in today's studies. Yet societal biases are still alive and well and influencing
educational research and policy. The following examples are but two of the
educational areas in which societal biases have greatly influenced research.

Student Interaction

More than thirty years ago, "separate but equal" education was declared
unconstitutional. Although education is still not fully integrated, the 1954 ruling did cause
educational researchers to focus on integration and on interracial interaction among
students (i.e., the degree to which students of color and white students would talk
together, work together, and generally become friends). This work sought to find
ways of reducing "racial isolation," and the results were used to develop
multimillion-dollar programs to encourage or facilitate public school desegregation. The
results are still being used to design current programs. Most of these studies were
seriously flawed, however; their results were at best incomplete and at worst totally
wrong.

We know that students are more apt to be with, talk to, and make friends with
students of their own sex than to do so with members of the other sex. We also know
that the interactions boys have with other boys differ from those girls have with
other girls and differ as well from the interactions between girls and boys (Best,
1983). The few studies that have looked at both race and sex differences in student
interaction suggest that boys are more apt to interact with boys from different races
than girls are to interact with girls from different races (Schofield and Sanger,
1977). Yet most of the studies of interracial interaction did not even indicate
the sex of the students being studied, let alone look at any effects that the sex of the
students may have had on results (Slavin and Madden, 1979; Weinberg, 1977).

Those studies that looked at race and sex tended to examine the interactions of
girls with girls and boys with boys. The exceptions, studies that looked at the
interaction of race and sex, found that sex, not race, was the best predictor of
interaction. Upper-elementary students were more apt to talk with and work with
same-sex students of a different race than with different-sex students of the same
race (Campbell, 1980). Regardless of race, same-sex interactions were more
positive than cross-sex interactions were. At the same time, girl-boy interactions
were more apt to be negative than same-sex interactions were, regardless of race
(Campbell, 1980). Sex, not race, was the important factor.

What has been reported as racial isolation may be racial isolation or it may be
that the students of color being studied were more apt to be of one sex and the white
students of the other sex. Since girls have been found to have fewer acquaintances
than boys and to be more likely to choose same-race girls as friends, differences in
interactions may be related to the proportion of girls and boys being studied.

The interaction of race and sex is rarely studied or even considered, a fact that
may be due to society's discomfort with interracial girl/boy relationships
(Weinberg, 1977). Regardless of why the interaction of race and sex has not been studied,
ignoring it has contributed to our lack of information about racial isolation in
schools, the degree to which it exists, and what can be done to reduce it.

Many other variables that can affect student interaction have also been ignored.
Social class, for example, has been found to be an important component of student
interaction, at least at the high school level (Petroni, Hirsch, and Petroni, 1970).
Thus if the socioeconomic status of Black students and white students is different
(which in many integrated high schools is the case), then at least some of what is
described as racial isolation may actually be attributed to class differences.

In addition, work on interracial interaction among students has been done
primarily on Blacks and whites and generalized to others. It is not realistic to expect
that relationships between Blacks and whites will be identical or even similar to
those between (a) whites and other students of color, (b) recent immigrants and the
native born, or (c) those whose native language is English and those whose native
language is not. Yet when results of studies on Black-white student interaction are
generalized to student interaction between whites and other students of color, that
is exactly what is being done.

Because they were based on the available research, programs designed to
encourage multiculturalism and cross-race interactions rarely considered the
effects of sex and class in their design. Thus the programs are less effective than they
could be.

Mathematics Ability

It is widely believed that boys are better than girls in mathematics, and many,
including some educational researchers, believe there is a biological basis for any
differences (Benbow and Stanley, 1980). Beliefs about gender and math ability play
a large role in what research is done and what conclusions are drawn.

Reading research conclusions, or even the preceding paragraph, one would
assume that the differences between girls' and boys' math abilities are large and
extensive. This isn't the case. Some studies have found sex differences in various
mathematical areas; other studies have not. And when differences have been found,
they are usually small. Many girls have higher math skills than most boys do, and
many boys have lower math skills than most girls do. Differences within groups of
girls or groups of boys are much greater than differences between the "average" girl
and the "average" boy. Yet this distinction is rarely made in the research or in
discussions of the implications of sex differences in mathematics for teaching
strategies.
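
The distinction between within-group and between-group differences can be made concrete with a short calculation. This is a minimal sketch under stated assumptions: scores in each group are treated as normally distributed with the same spread, and the gap between the two group averages is set to 0.2 standard deviations. That figure is a hypothetical "small" gap chosen for illustration; it is not taken from any study cited here, and the function names are likewise illustrative.

```python
from statistics import NormalDist


def share_above_other_mean(gap_in_sd):
    """Share of the lower-scoring group that scores above the other group's average.

    Assumes both groups are normal with standard deviation 1, the lower group
    centered at 0 and the higher group centered at gap_in_sd.
    """
    lower_group = NormalDist(mu=0.0, sigma=1.0)
    return 1.0 - lower_group.cdf(gap_in_sd)


def distribution_overlap(gap_in_sd):
    """Overlapping area (0 to 1) of the two groups' score distributions."""
    return NormalDist(0.0, 1.0).overlap(NormalDist(gap_in_sd, 1.0))


if __name__ == "__main__":
    gap = 0.2  # hypothetical small gap between group averages, in SD units
    print(f"share above the other group's average: {share_above_other_mean(gap):.2f}")
    print(f"overlap of the two distributions: {distribution_overlap(gap):.2f}")
```

With the assumed gap of 0.2 standard deviations, about 42 percent of the lower-scoring group still scores above the other group's average, and the two distributions overlap by about 92 percent. A statistically detectable difference in averages therefore says very little about any individual student.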

In the past, most studies of sex differences in mathematics either gave
standardized math tests to large numbers of students or used student math scores that
had been collected for other purposes, such as the SAT:M scores or those collected
for the National Assessment of Educational Progress. Most of these analyses
showed boys having higher scores.

A comparison of girls' and boys' test scores does not mean and cannot mean
that being female (or male) "caused" the scores to be different. Since the researcher
could not assign subjects to be girls or to be boys, no one can be sure that the only
difference between the groups being studied was their sex. There are many
differences in the experiences of girls and boys that could affect their math
achievement, including differences in the number and types of math courses girls
and boys take.

There are sex differences in math courses taken. Boys generally take more and 
higher level math courses than girls do (Becker and Jacobs, 1983). This factor, of 
course, affects the amount of math they know and their math test scores (Jones, 
1986). Before taking a geometry course, boys in seventy-four schools were found 
to have better geometry skills than girls had. However, when students were retested 
after taking the course, no sex differences were found in geometry skills (Senk and
Usiskin, 1983). 

The effects of such variables as differential course-taking must be investigated
before any conclusions can be drawn on the cause, or even the existence, of sex
differences in mathematics. Yet, because so many researchers have been sure that
sex differences exist and are "natural," differences in girls' and boys' experiences
are rarely examined or even mentioned as a possible factor in sex differences. In
fact, when the number of math courses a student has had is taken into account, sex
differences in mathematics are reduced or eliminated (Jones, 1986; Pallas and
Alexander, 1983).
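
What "taking the number of math courses into account" means can be sketched with a toy data set. The records below are entirely hypothetical and were constructed for illustration only; they are not data from Jones (1986) or Pallas and Alexander (1983). In this invented example, score depends only on courses taken, and boys are simply more likely to have taken more courses.

```python
from statistics import mean

# Hypothetical records: (sex, math courses taken, test score).
records = [
    ("F", 2, 50), ("F", 2, 52), ("F", 2, 48), ("F", 4, 70),
    ("M", 2, 50), ("M", 4, 70), ("M", 4, 72), ("M", 4, 68),
]


def score_gap(rows):
    """Average male score minus average female score for the given rows."""
    male = mean(score for sex, _, score in rows if sex == "M")
    female = mean(score for sex, _, score in rows if sex == "F")
    return male - female


# Comparison that ignores course-taking.
pooled_gap = score_gap(records)

# Comparison made separately among students with the same number of courses.
course_levels = sorted({courses for _, courses, _ in records})
gap_by_courses = {
    level: score_gap([row for row in records if row[1] == level])
    for level in course_levels
}

print("gap ignoring courses:", pooled_gap)
print("gap within each course level:", gap_by_courses)
```

In this invented data the pooled comparison shows a 10-point gap favoring boys, while the gap among students with the same number of courses is zero at every level. The apparent sex difference is produced entirely by who took which courses, which is the pattern the studies cited above describe.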

In a very well publicized exception, Benbow and Stanley compared math
achievement on the SAT:M for gifted female and male seventh-graders and found
males scoring higher. Since female and male seventh-graders take the same math
courses, the researchers concluded that the differences came from "superior male
mathematical ability" and suggested a genetic/biological reason for the differences
they found (Benbow and Stanley, 1980). Furthermore, although only academically
gifted students were studied, Benbow and Stanley concluded, with no evidence, that
their results would be observed even if a broader population were studied.

Benbow and Stanley's conclusions were reported in Newsweek, on NBC's
"Today," and in newspapers nationally. From these reports many people, including
educators and parents, concluded that sex differences in math achievement were
genetic. On the one hand, mothers who had heard about the Benbow and Stanley
study had lower expectations of their daughters' aptitude for and achievement in
mathematics; on the other hand, fathers who had heard about the study were more
apt to think math was important for their daughters (Jacobs and Eccles, 1985).

While Benbow and Stanley did control for math courses taken, many other
factors were not mentioned, much less taken into consideration. Even when girls
and boys take the same math courses, their experiences and the encouragement they
receive in class are quite different; boys, for instance, generally receive more praise
and attention, which reinforces the concept of math as a male domain (Becker,
1981). Girls and boys have different math experiences and encouragement outside
school as well. Boys' play experiences, for example, provide boys with more
opportunities to develop and improve spatial skills (Greenberg, 1978).

The test used to analyze math ability also has an impact on research results. As
indicated earlier, test developers have found that girls tend to score higher on essay
and fill-in-the-blank questions, while boys tend to score higher on multiple-choice
questions. Thus sex differences can be increased or decreased by changing the types
of items on a test (Dwyer, 1976). Yet the type of test items is rarely considered,
controlled for, or even mentioned in studies of sex differences in mathematics (or
in other areas).

Test content also affects results. Many of the studies of sex differences in
mathematics use the SAT:M. In 1971, Dr. Thomas Donlon of the Educational
Testing Service stated that although males scored about 40 points higher than
females on this test, the difference could be cut in half by having items cover subject
matter that was more familiar to females. Today this sex difference could still be
greatly reduced using the same technique.

Most researchers know that different math achievement tests find greater,
lesser, or no sex differences. The Benbow and Stanley study used the SAT:M to
measure sex differences even though the authors were aware that this test finds sex
differences "both early and late," whereas other tests such as the School and College
Aptitude Test-Quantitative do not detect an early sex difference (Benbow and
Stanley, 1980, 1983). It is conceivable that they chose the SAT:M because they
believed there were sex differences in mathematical ability. One wonders what an
"objective" researcher would have done.

It is important, and possible, to account for differences in course-taking
behavior; it is also possible to choose a test that minimizes sex differences in math.
It is equally important but much more difficult to account for different treatment in
the same math classes and for different experiences outside school. Whether or not
such areas can be controlled in the research, readers need to be made aware of them.
Yet these issues and supporting research results have not been cited in national news
magazines, nor have authors of studies in these areas appeared on national
television. It may be that these results did not receive the publicity accorded
Benbow and Stanley's work because the latter study reinforced the stereotype that
boys are naturally better in mathematics, whereas the others challenged it.

Work on racial differences in mathematics is subject to similar caveats. The
relationship between the number of math courses taken and math achievement holds
true for cross-race as well as for cross-sex studies. Studies have found the number
of math courses taken to be a better predictor of math achievement than such
variables as parent occupation, mean grade-point average, or racial/ethnic
background (Jones, Burton, and Davenport, 1984; Welch, Anderson, and Harris, 1982).
For example, studies of math differences between Blacks and whites frequently
don't, but should, reflect the smaller number of Blacks taking advanced math
classes and the smaller number of math courses they have taken.

There may be sex and race differences in math achievement, but at this point
we don't know how large or how firm these differences are or what their causes are.
Because of bias in research, it is difficult to know even whether the differences are
real, let alone whether they are large enough to be of concern. In addition, without
adequate information on causes, it is very difficult to determine the types of
programs that need to be developed. Should efforts, for example, focus on affective
issues (such as encouraging people of color and young women to take more math
courses) or on cognitive issues (such as providing more experience in spatial skills)?
Good research is a precursor to effective efforts to provide equal learning
opportunities to all students.





Some Guidelines 
on Bias in Research 



Yesterday and today, biased beliefs about female and male students and about 
students of color and white students have affected research results and thus 
influenced the educational decision-making process. The following guidelines may 
prove helpful in assessing research for bias, reducing its negative effects, and making
educational research more valuable to policymakers and practitioners. 

Guidelines for Evaluating Research for Bias

1. Can you tell the author's opinions or biases as you read the study? For
example, the author's bias is clear when a study is done to determine the
negative influence of mothers' employment on children's achievement.

2. Do authors use different words depending on the sex or race of those being
studied? For example, if studies of father absence are labeled "father
absence" while studies of mother absence are labeled "maternal deprivation,"
bias is present.

3. How is racial or ethnic group membership defined? For example, if genetic
differences between Blacks and whites are concluded when no genetic
definitions of Black or white are given, the study is biased.

4. Are the tests used "fair"? Does the study indicate whether the tests were
developed and used with females and males from a variety of racial and
cultural backgrounds?

5. Does the study describe who is being studied, including their sex and race? 

6. Are the results of the study applied only to people like those studied or are
they overgeneralized to include others? For instance, are people of color
included in conclusions when only whites were studied?

7. Are sex and race similarities as well as differences reported?

8. Are the conclusions based on the author's results or on the author's
expectations?

Guidelines for Reducing Bias When Conducting Research 

1. The research design should account for confounding variables related to the
race and sex of subjects. Researchers should determine the validity of
measures of independent variables dealing with race and sex. Independent
variables that define one group in terms of another, such as basing a
woman's socioeconomic status on that of her husband or father, should not
be used.

2. A review of the research literature should include a critical analysis of prior
research, with information on important characteristics of the groups
being studied. Researchers should expand their literature search to include
publications focusing on women and people of color (e.g., Psychology of
Women Quarterly and Black Education). Any literature cited in a research
article should be subject to critical assessment that includes guidelines on
research bias. Mention should be made of any major weaknesses.

3. Unless there is a demonstrable rationale for restricting a sample to one race
or sex, samples should be multiracial and include both females and males.
In the development and testing of models, researchers should include
women and people of color, rather than using them, post hoc, to investigate
how well they fit existing models devised from white male samples. If
samples are not multiracial or do not include females and males, then a
justification for the makeup of the sample should be made and the results
should not be generalized to groups unrepresented in the sample. A
sample's race and sex characteristics should be described in any report of
the research.

4. Only tests that are not biased for or against the groups being studied should
be used. In developing or selecting tests for research, researchers should
avoid tests that

• use exclusionary language or other offensive language or questions 

• do not include materials relevant to women and people of color 

• give no evidence of validity for the individual groups being tested 

5. Researchers should control for possible effects of observers' perceptions
of "appropriate" behaviors for subjects from different racial and gender
groups. If possible, researchers should mask the sex of young subjects.
Observers should be made aware of the effects that stereotypic expectations
may have on their ratings. Sex and race differences found through
observation should be substantiated using other methods of data collection.

6. Conclusions should be referenced directly to the results of the study.
Nonstereotyped as well as more traditional explanations of results should
be explored. If, for example, nonsignificant differences are found, they
should not be reported simply as differences, and when sex and race
differences are found, a variety of possible explanations for them should
be considered.



Making Research Better 
Next Steps 



1. Apply the suggestions and guidelines included in this monograph to your
own actions.

• Don't make decisions based on what "research says" until you check the
results for general accuracy and bias.

• Don't pass on "facts" without checking on their accuracy.

• Read the entire research study instead of just the beginning and end.

• Use the same criteria to evaluate studies whose results you feel must be
right as you do to evaluate studies whose results you feel can't be right.

2. Make others more aware of bias in research and its effects on education and
other areas. Discuss issues related to bias in research with your students
and colleagues. Consider distributing copies of the brochures that accompany
this monograph (available from the WEEA Publishing Center, EDC,
55 Chapel St., Newton, MA 02160). These brochures were designed
specifically for teachers, administrators, counselors, students, parents, and
others interested in education.

3. When you find a study that is biased, do something about it. Write down
what aspects of the study you think are biased and why; note what effect
you think this bias may have had on the study's results. Write to the editor
of the journal or magazine in which the research was published and
describe your concerns. Send a copy of your letter to the author as well.
Suggest that the journal include information on sex and race bias in
research in the guidelines it provides to its authors and reviewers. Offer
to become a reviewer yourself.

4. Professional organizations are paying more attention to standards and
guidelines to increase the quality of research and evaluation. Some
guidelines, such as Standards for Evaluation of Educational Programs,
Projects, and Materials (The Joint Committee on Standards for Educational
Evaluation, 1981), do not deal with issues of bias at all; other
guidelines, such as those of the American Psychological Association,
cover the effects only of sex bias on research; and still other guidelines,
such as those of the American Educational Research Association, include
the effects of both race bias and sex bias on research. Find out what steps,
if any, your professional organizations are taking to address bias in
research, and encourage them to develop, approve, and use guidelines to
reduce bias in research.

5. Find out more about bias in research. The following references can
provide additional information:

• Campbell, P. B. (1983). Racism and sexism in research methods.
Encyclopedia of Educational Research. New York: Macmillan.

• Ehrenreich, B., & English, D. (1979). For Her Own Good. Garden City:
Anchor.

• Gould, S. J. (1981). The Mismeasure of Man. New York: W. W. Norton
and Company.

• Thomas, A., & Sillen, S. (1972). Racism and Psychiatry. New York:
Brunner/Mazel.

6. Find out more about research in general. The Appendix, "A Beginner's
Guide to Educational Research," is a good place to start.



Appendix 



A Beginner's Guide to 
Educational Research 

What Is Research?

Research, according to Webster's, is an "investigation or experimentation aimed at
the discovery and interpretation of facts." In education there are two major types
of research: basic research, in which the goal is a better understanding of learning
and the educational process, and applied research, which focuses on finding
information that will improve current educational practice.

The Research Method 

Educational research, like research in other areas, relies heavily on what is known
as the scientific method. This is a system of investigation that typically involves the
following steps:

• development and statement of a question or a problem to study

• formulation of a hypothesis (a "best-guess" answer to the question being
studied, based on existing theory and research)

• development and implementation of a structured plan (or design) to test the 
accuracy of the hypothesis or to answer the questions posed 

• determination of the results of the plan 

• generation of conclusions based on research results and the development of 
further research questions based on both the results and the conclusions 

The Sample

In research, those being studied are called participants, or subjects. A group of
subjects is called the sample. The sample is supposed to be representative of a
population, a larger group to whom the results of the research can be applied (i.e.,
the results of a study on a sample are generalized to the population that sample
represents). For example, if you selected ten students from each of ten classes for
a research project, each of the students would be a subject, the one hundred students
would be your sample, and the ten classes would be the population to which your
results could be applied.

The Design

Both basic research and applied research can be done using one of several different
designs, or plans of action. These plans are based on a variety of factors, including
the type of research being done and the resources available.

The Experimental Design. If one wishes to determine whether something "causes"
something else, the experimental design is the most effective. In this design, one
or more groups, called the experimental group(s), receive some sort of treatment
(e.g., a new reading program, a new tutoring program, or smaller class size) while
a similar group, the control group, receives no treatment. The essence of the
experimental design, that which makes it the "best" design for causal research,
is that each of the subjects studied has an equal chance of being selected for either
the experimental or the control group. Use of this design increases the chance that
the only difference between the experimental group and the control group will be
that one receives the treatment and one doesn't. Thus if differences show up
between the experimental and control groups, those differences can be said to be
caused by the treatment. For example, if the one hundred students in our sample are
selected by chance to go either into a group that receives money for getting an A or
into a group that receives no money for getting an A, then we have an experimental
design. If the subjects receiving money get higher grades than those who don't
receive money, then, because it is an experimental study, we can say that, for that
sample, receiving money for good grades improves student grades.
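
The random assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name assign_groups, the fixed seed, and the numbering of students are invented for the example and do not come from any particular study.

```python
import random


def assign_groups(subjects, seed=None):
    """Randomly split subjects into an experimental and a control group.

    Shuffling the whole list gives every subject the same chance of
    landing in either group, which is the core of the experimental design.
    """
    rng = random.Random(seed)  # a seed only makes the example repeatable
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical sample: one hundred students identified by number.
students = range(1, 101)
experimental, control = assign_groups(students, seed=0)
print(len(experimental), len(control))  # 50 50
```

Because chance alone decides who lands in which group, any characteristic of the students (sex, race, prior grades) should be spread about evenly across the two groups, although in a small sample an uneven split can still happen by luck.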

The Ex Post Facto Design. Experimental designs are not always appropriate. For
instance, because people cannot be randomly assigned to be female or male or to be
Black or white, an experimental design cannot be used to study sex or race. Neither
can an experimental design be used to study something that has already occurred,
because it is then too late to randomly assign subjects.

For these kinds of studies an ex post facto, or quasi-experimental, design can be
used. Ex post facto is a Latin expression meaning "after the fact." In an ex post facto
study, the researcher does not control who is in the experimental group and who is
in the control group. Therefore, it is not possible to be sure that the only difference
between the groups being studied is the treatment or to conclude that the treatment
"caused" any differences between the groups.

For example, a researcher studying young children at play might find that girls
and boys have different patterns of play. The researcher could conclude that girls
and boys play differently but could not conclude that being a girl or being a boy
"caused" the children to play differently. There are many other variables that might
account for the differences. These variables might include that all the boys are
wearing pants while a number of the girls are wearing dresses; that teachers
frequently give different instructions and play suggestions to girls versus boys; or
that parents are more apt to be concerned about girls "keeping clean" than about
boys doing so.

Post Hoc Fallacy. Drawing invalid conclusions based on ex post facto research is
so prevalent that researchers have a special name for it: a post hoc fallacy. Since
by definition all research comparing females with males, people of color with
whites, and disabled with able-bodied persons is ex post facto, it is particularly
important to check for post hoc fallacies in such studies. You should suspect the
existence of a post hoc fallacy whenever a study concludes that being female or
Black or disabled, for example, causes something to happen, whatever that
something might be.

Other Designs. There are a number of other ways that research can be done,
including the following:

• Survey, or descriptive, research, in which there is no treatment and subjects
respond to a series of written or oral questions describing a situation or area
of interest. A study of student attitudes toward school would be an example
of survey research.

• Qualitative, or naturalistic, research, in which the researcher observes
people in a natural setting and, over a period of time, almost becomes a part
of a group in order to be able to analyze group processes and interactions. A
study of how fourth-graders' behavior changes in terms of how fourth-graders
interact with the teacher during the school year would be an example
of a qualitative study.

• Correlational research, in which the researcher examines the degree to which
changes in one variable are reflected in changes in one or more other variables.
A study of the relationship between achievement test scores and grades would
be an example of correlational research.

• Historical research, in which analysis is based on documents and data from
the past. A study of the different ways that reading was taught in the
nineteenth century would be an example of historical research.

It is important to note that regardless of the research design used, if there is no
random assignment of subjects to the treatment, then you cannot be sure that the
treatment caused any differences.
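
As an illustration of the correlational design listed above, the sketch below computes a Pearson correlation coefficient, one common measure of how closely two variables move together. The choice of this particular coefficient, the function name pearson_r, and the test scores and grades are all assumptions made for the example; none of them come from an actual study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: +1 means the two variables rise together
    in a straight line, -1 means one falls as the other rises, and 0
    means no straight-line relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical achievement test scores and grade-point averages
# for six students (invented data).
test_scores = [52, 61, 68, 74, 80, 88]
grades = [2.0, 2.4, 2.9, 3.0, 3.4, 3.8]

r = pearson_r(test_scores, grades)
print(round(r, 2))
```

A coefficient near 1 says only that students with higher test scores also tend to have higher grades. Since no one was randomly assigned to anything, it says nothing about whether one causes the other.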

Sources of Invalidity

Obviously, the quality of research (its validity, the degree to which results are
accurate and can be attributed to that which is being studied) is very important.
Researchers have long been concerned about the validity of their work and have
attempted to design studies that control for as many sources of invalidity as possible,
even though few control for or even consider societal biases. The following is a list
of the more common sources of invalidity not related to societal biases.

• The Hawthorne Effect. Being studied and getting extra attention, being
"special," may be enough to cause changes in the subjects independent of
what is being studied. The original Hawthorne study was done with factory
workers. Researchers found that productivity increased when they did
positive things (increased light, increased breaks); they also found that
productivity increased when they did negative things (increased room
temperature). Further work found that it was the increased attention that
raised productivity. Using a second group that gets the attention but not the
treatment controls for the Hawthorne Effect.

• Maturation. Just growing older can have a strong influence on subjects,
particularly if young children are being studied. For example, if researchers
are studying the effects of a year-long program on children's language
development, they must remember that children's language skills will
improve in a year regardless of the program used. Without a same-age
control group, the researcher will not know how much of a change is caused
by that which is being studied and how much is caused by the subjects'
getting older.

• Testing. Testing can affect a study in many ways. Obviously, if a test doesn't
measure what it is supposed to measure, then results will be incorrect. In
addition, taking a test can affect subjects; changes in subjects may be due to
the test rather than the treatment. For example, the practice of taking a pretest
on fractions might do more to increase students' abilities to work with
fractions than the treatment does. Finding tests that have been found to be
valid (that do measure what they say they measure) and using a control group
that takes the tests but not the treatment are ways of controlling for the
influence of testing.

• History. In an ideal study, the only difference between groups being studied
is the treatment; however, during a study, groups may have different
experiences (a teacher might get sick, a school might start a new project). By
the end of the research period, the different histories of the groups, rather than
that which is being studied, might be the cause of any changes. It is very
difficult to control for history; being aware of the unexpected and unintended
events that occurred and reporting them in the results are about all that can
be done.
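
Several of the items above rely on a control group to separate the effect of a treatment from changes that would have happened anyway (attention, maturation, or practice with the test). The arithmetic behind that idea can be sketched as follows. The scores, the group names, and the helper mean_gain are hypothetical, and the comparison is only meaningful if the two groups were similar to begin with.

```python
from statistics import mean


def mean_gain(pre, post):
    """Average improvement from pretest to posttest for one group."""
    return mean(after - before for before, after in zip(pre, post))


# Invented language scores for a year-long program and a same-age control group.
program_pre = [40, 45, 50, 55]
program_post = [55, 58, 66, 69]
control_pre = [41, 44, 51, 54]
control_post = [50, 54, 60, 62]

program_gain = mean_gain(program_pre, program_post)  # gain with the program
control_gain = mean_gain(control_pre, control_post)  # gain from growing older, retesting, etc.

# Only the gain beyond what the control group achieved can be credited to the program.
estimated_effect = program_gain - control_gain
print(program_gain, control_gain, estimated_effect)
```

Without the control group, the whole of the program group's gain might be mistaken for the effect of the program.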



Wait! Even though your first impulse may be to skip this section, read on. Many
of us are afraid of statistics and convinced that we can never understand them. That
does not have to be the case. Even though most of us will never become theoretical
or even applied statisticians, we can, with a little effort, learn enough to begin to
make sense of the statistical section of a piece of research. Always keep in mind that
statistics are just a way of reducing large amounts of information (test scores, height,
attitudes, rankings, almost anything) into summaries that provide useful
information.

There are two basic types of statistics: descriptive and inferential.

Descriptive Statistics

Descriptive statistics, as their name implies, reduce and describe a large amount of
information. Typical descriptive statistics include the mean (average), mode (most
frequent score), and median (score at which half the scores are below and half are
above). The standard deviation is the measure of how varied or spread out a set of
scores is. If a set of scores has a mean of 10 and a standard deviation of 1, most of
the scores in that set will be very close to one another and to the mean of 10. About
two-thirds of the scores will be between 9 and 11. A group whose scores are close
together is called homogeneous. A group in which the scores are much more spread
out, where the standard deviation is much larger, is called heterogeneous. For
example, a set of scores with a mean of 10 and a standard deviation of 5 is more
spread out than the first example (it is therefore heterogeneous). About two-thirds
of this group's scores would be between 5 and 15.
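
These descriptive statistics can be computed with Python's standard statistics module. The sketch below is an illustration only: the quiz scores are invented, and the two large groups are simulated from bell-shaped (normal) distributions, which is the situation in which the "about two-thirds" rule of thumb holds. The helper share_within_one_sd is a name made up for the example.

```python
import random
from statistics import mean, median, mode, pstdev

# Hypothetical quiz scores, invented for this illustration.
scores = [7, 9, 10, 10, 10, 11, 13]
print(mean(scores))    # average: 70 / 7 = 10
print(median(scores))  # middle score once sorted: 10
print(mode(scores))    # most frequent score: 10


def share_within_one_sd(values):
    """Fraction of values lying within one standard deviation of the mean."""
    center = mean(values)
    spread = pstdev(values)
    inside = [v for v in values if center - spread <= v <= center + spread]
    return len(inside) / len(values)


# Simulated groups: same mean of 10, standard deviations of 1 and 5.
rng = random.Random(0)
homogeneous = [rng.gauss(10, 1) for _ in range(10_000)]
heterogeneous = [rng.gauss(10, 5) for _ in range(10_000)]

print(round(pstdev(homogeneous), 1), round(share_within_one_sd(homogeneous), 2))
print(round(pstdev(heterogeneous), 1), round(share_within_one_sd(heterogeneous), 2))
```

In both simulated groups roughly two-thirds of the scores fall within one standard deviation of the mean; what differs is how wide that band is (about 9 to 11 in the first group, about 5 to 15 in the second).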

Other descriptive statistics include stanines, percentiles, and standard scores.
Stanines break the distribution of scores into nine sections (1=lowest, 9=highest)
and indicate in which of the nine sections an individual score falls. Percentiles (from
0 to 99) describe the percentage of scores that are lower than an individual score.
Standard scores describe how far an individual score is from the mean score: if 0
represents the mean, then a standard score of 1.5 means that the individual score is
one and one-half standard deviations above the mean, whereas a score of minus 2
means a score is two standard deviations below the mean.
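
The three kinds of scores just described can be sketched in Python as follows. The scores are invented so that the mean is 10 and the standard deviation is 2. The function names are hypothetical, and the stanine rule used here (double the standard score, add 5, round, and keep the result between 1 and 9) is the conventional one, assumed for the example rather than taken from the text above.

```python
from math import floor
from statistics import mean, pstdev


def standard_score(score, scores):
    """How many standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean(scores)) / pstdev(scores)


def percentile_rank(score, scores):
    """Percentage of scores in the group that are lower than this score."""
    lower = sum(1 for s in scores if s < score)
    return 100 * lower / len(scores)


def stanine(score, scores):
    """Conventional stanine: 2z + 5, rounded to the nearest whole number, limited to 1..9."""
    z = standard_score(score, scores)
    return max(1, min(9, floor(2 * z + 5.5)))


# Hypothetical group of ten scores with mean 10 and standard deviation 2.
scores = [6, 8, 10, 10, 10, 10, 10, 10, 12, 14]

print(standard_score(13, scores))   # 1.5 standard deviations above the mean
print(standard_score(6, scores))    # 2 standard deviations below the mean
print(percentile_rank(12, scores))  # 80.0: eight of the ten scores are lower
print(stanine(13, scores))          # 8
```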

Inferential Statistics

Whereas descriptive statistics describe what the information is, inferential statistics
tell what can be implied from that information. Inferential statistics tell us the odds
that the differences between groups can be attributed to chance or are real and
could be replicated (that is, if the study were done again, the results would be
similar).

Most researchers feel that they have to be at least 95 percent certain that
differences between groups are real before they are willing to say so. This is known
as the level of probability or significance and is generally shown as p < .05, meaning
that the differences between groups were large enough that the chances are better
than 95 out of 100 that the study could be replicated with similar results. The
statement (p < .05) is considered an acceptable level of risk, and differences between
groups at the p < .05 level are considered statistically significant. If the study reports
(p < .01), then the chances are better than 99 out of 100; (p < .001) means the chances
are better than 999 out of 1000; and so on.
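
One way to see where a p value comes from is a permutation test, sketched below. This technique is chosen here only because it needs no formulas; it is an assumption of the example, not necessarily the test a given study would use. The scores, group names, and function name are invented. The idea is to shuffle all the scores between the two groups many times and count how often chance alone produces a difference at least as large as the one actually observed.

```python
import random


def permutation_p_value(group_a, group_b, trials=10_000, seed=0):
    """Estimate how often random regrouping yields a mean difference
    at least as large as the observed one (a two-sided p value)."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # pretend group labels were handed out by chance
        first, second = pooled[:n_a], pooled[n_a:]
        difference = abs(sum(first) / len(first) - sum(second) / len(second))
        if difference >= observed:
            at_least_as_large += 1
    return at_least_as_large / trials


# Hypothetical test scores for two groups of eight students.
group_one = [78, 82, 85, 88, 90, 91, 93, 95]
group_two = [60, 62, 65, 68, 70, 72, 75, 77]

p_value = permutation_p_value(group_one, group_two)
print(p_value < 0.05)  # True: a gap this large almost never arises from shuffling alone
```

A result below .05 means that fewer than 5 shuffles in 100 produced a gap as large as the real one, which is the sense in which the difference is called statistically significant.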

Significant Differences 

There are two types of significant differences: statistical and practical. Statistical
significance means that the differences between or among groups are most likely
real and would be found if the study were replicated. However, just because a
difference is statistically significant does not necessarily mean that it has practical
meaning. Statistical significance is related to a number of things, including the size
of the differences between groups, the number of subjects, and the degree that the
scores in each group are spread out. For example, a study of 10,000 students might
find that students using Math Book A increased their math achievement 1 percent
more than students using Math Book B did. This difference would be statistically
significant, meaning that the differences were most likely real and not due to chance.
A teacher choosing a math book, however, could and most likely would say that a
1 percent difference was not meaningful; with such a small difference, factors such
as cost and ease of use would and should be the deciding factors in book selection.
Thus the statistically significant difference would have no practical significance.
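
The role of sample size in this example can be shown with a short calculation. The sketch below uses a simple two-group z test with a known, shared standard deviation; that choice of test, the function name, and every number in it (means of 50.5 and 50.0, a standard deviation of 10, groups of 5,000 or 50) are assumptions invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_a, mean_b, sd, n_per_group):
    """Return (z, two-sided p) for two equal-sized groups sharing one standard deviation."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = (mean_a - mean_b) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Same half-point (1 percent) difference, studied with 10,000 students and with 100.
z_large, p_large = two_sample_z(50.5, 50.0, sd=10, n_per_group=5000)
z_small, p_small = two_sample_z(50.5, 50.0, sd=10, n_per_group=50)

# Size of the difference in standard deviation units, regardless of sample size.
effect_size = (50.5 - 50.0) / 10

print(p_large < 0.05)  # True: statistically significant with 10,000 students
print(p_small < 0.05)  # False: the same difference is not significant with 100
print(effect_size)     # 0.05 of a standard deviation either way
```

The difference between the books is identical in both cases. Only the number of students changed, which is why statistical significance alone cannot tell a teacher whether a difference is large enough to matter.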

This has been only an introduction to research methods; there is much more to learn.
For additional information, see F. Kerlinger's Foundations of Behavioral Research
(New York: Macmillan, 1972) or any of the many educational research books
available in local college or public libraries.



We all know that sex and race bias in research has been a fact of history.
In The Hidden Discriminator: Sex and Race Bias in Educational Research,
author Patricia B. Campbell shows that bias in educational research is still
alive and well. Using examples both from the past and the present, Campbell
examines the myriad ways that bias can affect language: its influence on
the researchers, the groups selected for study, the questions that are asked,
the way the study is done, and the conclusions that are drawn. The result
is a startling picture of the influences that shape the research that we use
and that is used on us.

In language that is both lively and down-to-earth, Campbell not only 
describes the problem, but provides easy-to-follow guidelines for evaluating 
research and identifies "next steps" for reducing the incidence and effects 
of bias in research. An educational tool that shows us we can't responsibly 
accept research without asking some basic questions about it, this unique
book is a must for all those who use or are affected by educational research, 
including teachers, parents, administrators, counselors, and students. 

"The Hidden Discriminator provides an excellent discussion of race 
and sex bias in research methods currently not in textbooks on
research."

- Charol Shakeshaft, Hofstra University